Improving Language Model Behavior by Training on a Curated Dataset
We've found we can improve language model behavior with respect to specific behavioral values by fine-tuning on a curated dataset of fewer than 100 examples of those values. We also found that this process becomes more effective as models get larger. While the technique is still nascent, we're looking for OpenAI API users who would like to try it out and are excited to find ways to use these and other techniques in production use cases. Our approach aims to give language model operators the tools to narrow the universal set of behaviors a model can exhibit to a constrained set of values. While OpenAI provides guardrails and monitoring to ensure that model use cases are compatible with our Charter, we view selecting the exact set of Charter-compatible values for the model as a choice that our users must make for their specific applications.
OpenAI claims to have mitigated bias and toxicity in GPT-3
In a study published today, OpenAI, the lab best known for its research on large language models, claims it has discovered a way to improve the "behavior" of language models with respect to ethical, moral, and societal values. The approach, OpenAI says, can give developers the tools to dictate the tone and personality of a model depending on the prompt the model is given. Despite the potential of natural language models like GPT-3, many obstacles remain. The models can't always answer math problems correctly or respond to questions without paraphrasing training data, and it's well established that they amplify biases in the data on which they were trained. That's problematic in the language domain, because a portion of the data is often sourced from communities with pervasive gender, race, and religious prejudices.